Enable CPU/XPU native and ipex path #1628

jiqing-feng · 2025-05-08T07:23:19Z

This PR enables ipex and other optimizations including:

ipex fused op
enable fp4 on cpu
enable has_rem on quantize/dequantize 4bit
Simple 8bit matmul so can make finetune faster on CPU

Also, it fixed the parameter patch for cpu.

It could pass all transformers tests

After this PR merged, I will update the installation guide.

@matthewdouglas @Titus-von-Koeller

Signed-off-by: jiqing-feng <[email protected]>

bitsandbytes/functional.py

Signed-off-by: jiqing-feng <[email protected]>

github-actions · 2025-05-08T18:05:34Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

bitsandbytes/autograd/_functions.py

bitsandbytes/backends/cpu/ops.py

bitsandbytes/utils.py

bitsandbytes/backends/cpu/ops.py

bitsandbytes/functional.py

bitsandbytes/nn/modules.py

jiqing-feng · 2025-05-09T08:16:24Z

I am cleaning the CPU and XPU tests, process 50%

Signed-off-by: jiqing-feng <[email protected]>

Devjiu · 2025-05-09T11:28:35Z

bitsandbytes/functional.py

+            quant_state.blocksize,
+            quant_state.shape,
+            quant_state.dtype,
+        )


Is there reason why this change can't be in bitsandbytes/backends/cpu/ops.py?

@Devjiu See here: https://huggingface.slack.com/archives/C07FCTCJKU0/p1746584773504889

bitsandbytes/autograd/_functions.py

Signed-off-by: jiqing-feng <[email protected]>

tests/test_functional.py

tests/test_linear4bit.py

Signed-off-by: jiqing-feng <[email protected]>

jiqing-feng · 2025-05-13T06:36:10Z

Hi @matthewdouglas . Please check if there are anything missed before merging. We can discuss the tests in this issue #1637

Signed-off-by: jiqing-feng <[email protected]>

bitsandbytes/autograd/_functions.py

matthewdouglas · 2025-05-14T15:07:03Z

bitsandbytes/cextension.py

+    if not ipex_cpu:
+        logger.warning(
+            "The installed version of bitsandbytes was compiled without IPEX support. "
+            "You can install ipex by running `pip install intel_extension_for_pytorch`to get better performance if you use the Intel CPU.",
+        )


This seems like extra noise that we'd want to avoid.

Something to point out is that we still plan to ship libbitsandbytes_cpu in our wheels, so for most users, it's going to load a CPU, CUDA, or eventually ROCm or Metal library and we'll hit this logging line. At most we should really only raise this warning when:

We're on a platform with IPEX CPU support. My understanding is this is limited to Linux x86-64.

We expect the user to be using CPU, i.e. no CUDA, XPU, or MPS accelerators available.
On torch >= 2.6 we could just use torch.accelerator.is_available() and on older versions I think we can overlook privateuse1 backends like HPU or Ascend NPU.

There's some expectation of IPEX being beneficial. We don't want to prompt users to install it if e.g. it needs AVX512 or AMX support to be effective. This is something I can't speak to directly but defer to Intel folks to determine.

Any other thoughts @Titus-von-Koeller ?

You were right. I also agree that the log should only exist if no devices like cuda/xpu are available and the CPU is an Intel product.

I have changed it, please review again. Thanks!

bitsandbytes/autograd/_functions.py

Signed-off-by: jiqing-feng <[email protected]>

jiqing-feng · 2025-05-16T01:56:00Z

Hi @matthewdouglas . You can see some CPU ops are implemented by pure python, which works for all devices. Could we move these pure python implementations to the default folder? As some of these ops could be reused by XPU.

Signed-off-by: jiqing-feng <[email protected]>

jiqing-feng · 2025-05-21T05:15:21Z

Hi @matthewdouglas . I moved pure pytorch ops to the default folder and it works well. The CIs are all passed. Please continue to review this PR.

Signed-off-by: jiqing-feng <[email protected]>

jiqing-feng added 3 commits May 7, 2025 14:53

enable ipex

ba79025

Signed-off-by: jiqing-feng <[email protected]>

fix cpu 8bit quantization

958d75b

Signed-off-by: jiqing-feng <[email protected]>

fix int8 and nf4 cpu inference

f5c0b01

Signed-off-by: jiqing-feng <[email protected]>

jiqing-feng marked this pull request as ready for review May 8, 2025 07:25

add cpu fp4 and rem

7f2d8a8

Signed-off-by: jiqing-feng <[email protected]>

Devjiu reviewed May 8, 2025

View reviewed changes

bitsandbytes/functional.py Outdated Show resolved Hide resolved

jiqing-feng added 2 commits May 8, 2025 14:14

fix dequantize nf4 xpu

97d5bd1

Signed-off-by: jiqing-feng <[email protected]>

Merge branch 'main' into ipex

5563c35

matthewdouglas requested changes May 8, 2025

View reviewed changes

matthewdouglas added Intel Cross Platform x64 CPU labels May 8, 2025

matthewdouglas self-assigned this May 8, 2025

jiqing-feng added 4 commits May 9, 2025 10:32

fix ipex op

7b72673

Signed-off-by: jiqing-feng <[email protected]>

fix dequantize nf4 name

52e32af

Signed-off-by: jiqing-feng <[email protected]>

fix dequantize nf4 ipex

fda3d70

Signed-off-by: jiqing-feng <[email protected]>

Merge branch 'main' into ipex

5ce3296

Devjiu reviewed May 9, 2025

View reviewed changes

bitsandbytes/autograd/_functions.py Outdated Show resolved Hide resolved

jiqing-feng added 5 commits May 9, 2025 14:04

fix matmul8bitfp

f51678e

Signed-off-by: jiqing-feng <[email protected]>

enable cpu tests

7c9281c

Signed-off-by: jiqing-feng <[email protected]>

fix format

83cea6b

Signed-off-by: jiqing-feng <[email protected]>

fix quantize blockwise output shape

bc8723e

Signed-off-by: jiqing-feng <[email protected]>

fix quant_storage bf16 and gemv cpu

3c07023

Signed-off-by: jiqing-feng <[email protected]>

matthewdouglas reviewed May 9, 2025

View reviewed changes

tests/test_functional.py Outdated Show resolved Hide resolved

matthewdouglas reviewed May 9, 2025

View reviewed changes

tests/test_functional.py Outdated Show resolved Hide resolved

matthewdouglas added this to the v0.47.0 milestone May 9, 2025

matthewdouglas reviewed May 9, 2025

View reviewed changes

tests/test_linear4bit.py Show resolved Hide resolved

update ipex install guide

21cf8c1

Signed-off-by: jiqing-feng <[email protected]>

jiqing-feng mentioned this pull request May 13, 2025

CPU/XPU tests align #1637

Closed

jiqing-feng added 2 commits May 13, 2025 10:59

update install guide

a9e5c4a

Signed-off-by: jiqing-feng <[email protected]>

Merge branch 'main' into ipex

1a77949

Egor-Krivov reviewed May 14, 2025

View reviewed changes

bitsandbytes/autograd/_functions.py Show resolved Hide resolved

matthewdouglas reviewed May 14, 2025

View reviewed changes

Merge branch 'main' into ipex

b0cd993

Egor-Krivov reviewed May 15, 2025

View reviewed changes

bitsandbytes/autograd/_functions.py Outdated Show resolved Hide resolved

jiqing-feng added 3 commits May 15, 2025 13:04

fix error log

539f5d4

Signed-off-by: jiqing-feng <[email protected]>

fix error lof

005afe0

Signed-off-by: jiqing-feng <[email protected]>

Merge branch 'main' into ipex

4471ada

jiqing-feng added 2 commits May 16, 2025 09:53

update comment

cddeec6

Signed-off-by: jiqing-feng <[email protected]>

Merge branch 'bitsandbytes-foundation:main' into ipex

8492010

jiqing-feng changed the title ~~Enable ipex and other optimizations~~ Enable CPU/XPU native and ipex path May 21, 2025

jiqing-feng added 5 commits May 21, 2025 10:44

move torch op to default

25d01a4

Signed-off-by: jiqing-feng <[email protected]>

revert ipex check

8ff8947

Signed-off-by: jiqing-feng <[email protected]>

fix code tabledevice

82651f9

Signed-off-by: jiqing-feng <[email protected]>

fix code table device

413bba9

Signed-off-by: jiqing-feng <[email protected]>

fix xpu ops

cf8bc14

Signed-off-by: jiqing-feng <[email protected]>

This was referenced May 21, 2025

supports HPU double quant #1630

Merged

Wrong result 8bit blockwise quantization over float16 #1540

Open

jiqing-feng added 2 commits May 26, 2025 10:32

Merge branch 'main' into ipex

3183abb

Merge branch 'main' into ipex

9d03b00

matthewdouglas approved these changes May 28, 2025

View reviewed changes

matthewdouglas merged commit aaa71d7 into bitsandbytes-foundation:main May 28, 2025
41 checks passed

This was referenced Jun 2, 2025

Fix CI regression #1666

Merged

Add CPU + IPEX to nightly CI #1667

Merged

Enable CPU/XPU native and ipex path #1628

Enable CPU/XPU native and ipex path #1628

Uh oh!

Conversation

jiqing-feng commented May 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

github-actions bot commented May 8, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jiqing-feng commented May 9, 2025

Uh oh!

Devjiu May 9, 2025

Choose a reason for hiding this comment

Uh oh!

matthewdouglas May 9, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jiqing-feng commented May 13, 2025

Uh oh!

Uh oh!

matthewdouglas May 14, 2025

Choose a reason for hiding this comment

Uh oh!

jiqing-feng May 15, 2025

Choose a reason for hiding this comment

Uh oh!

jiqing-feng May 15, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jiqing-feng commented May 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jiqing-feng commented May 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jiqing-feng commented May 8, 2025 •

edited

Loading

jiqing-feng commented May 16, 2025 •

edited

Loading

jiqing-feng commented May 21, 2025 •

edited

Loading